CCP Prism X
C C P C C P Alliance
1386
|
Posted - 2013.12.03 15:36:00 -
[1] - Quote
I hear this guy listens to really stupid music (NSFW) which is probably indicative of his cognitive capacity. Probably not worth reading this! @CCP_PrismX EVE Database Developer and Expert Ranter Member of a Different Team, every day. |
|
|
CCP Prism X
C C P C C P Alliance
1387
|
Posted - 2013.12.03 15:50:00 -
[2] - Quote
Weaselior wrote: I have a concern. As you point out, this system means that adjacent systems tend to share the same physical hardware.
Doesn't that massively increase the chances of spillover tidi in nullsec battles: i.e. our staging system is overloaded so every system next to it is overloaded, massively increasing the pain in the ass to leave? Or the system the fight is in is massively overloaded, causing all systems around it to be massively overloaded as well making it a giant pain in the ass to get there?
If your fighting system is reinforced that shouldn't be an issue. This is also no more of an issue than it used to be, and now the TiDi you create in your staging systems will at least not be affecting players on the other side of the universe who have nothing to do with your pew pew.
And of course more nullsec nodes mean smaller pockets grouped together.
|
|
CCP Prism X
C C P C C P Alliance
1387
|
Posted - 2013.12.03 15:57:00 -
[3] - Quote
There seems to be an image missing there right now and some problem with our CDN. Hopefully that will get resolved soon, but it's why there's no "visualization" for the splitting process. Just a table of stats.
|
|
CCP Prism X
C C P C C P Alliance
1388
|
Posted - 2013.12.03 16:10:00 -
[4] - Quote
Wormholes do not need to be split by proximity. There's no sense of locality or proximity in WH space so they just get a very dumb but efficient method applied to them.
However, I reduced the number of nodes running WH space. We've added a few back, and apparently I should look at adding more before the weekend.
Sorry
Edit: Yes, this distribution is remade at every startup.
|
|
CCP Prism X
C C P C C P Alliance
1388
|
Posted - 2013.12.03 16:52:00 -
[5] - Quote
GeeShizzle MacCloud wrote: Prism is it possible to put hard barriers for the pre-loader to stop connecting low and null sec systems together on one node? does this happen already?
when remapping systems from an overloaded node to another can a low and null system end up on the same node after tomorrow?
Currently Nullsec is split off from Empire Space (High and Low). That's a change we made a long time ago, and it was covered in my first draft of this blog. That draft was however about three times the size of this one, without pictures. So yeah, we could do that, but we don't.
There's probably a case to be made for low-sec's load fingerprint being different due to different player behaviour. But as it stands low-sec systems get lumped in with high-sec.
|
|
CCP Prism X
C C P C C P Alliance
1402
|
Posted - 2013.12.04 15:23:00 -
[6] - Quote
I'M BACK!
I just want to clear up some confusion I'm seeing first: this is not meant to be the Holy Grail of lag reduction. This is a static load balancer. It is in no way the end-all solution to our load problems, it's an initial step that's required before anything further can be done. Optimizing underused resources is just wasted work. Well it's not wasted but it makes more sense to do it the other way around.
There is no load reduction going on here. Today's total load pressures with, or without, my code would be exactly the same. But with my code it will be more evenly distributed between nodes so that the probability of a "wild" TiDi appearing has been reduced. TiDi in unreinforced systems with a massive fleet presence will still happen as it always has. We'll need to reduce the CPU footprint per user if we want to prevent Fleet TiDi (and my money is on that only increasing fleet sizes until TiDi becomes unbearable again).
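To make the "same total load, different distribution" point concrete, here is a small Python sketch. The system names, node names and load figures are invented for illustration, and `node_totals` is a hypothetical helper, not anything from the EVE server code.

```python
def node_totals(allocation, loads):
    """Sum the per-system load estimates for every node in an allocation."""
    return {node: sum(loads[s] for s in systems)
            for node, systems in allocation.items()}

# Hypothetical per-system load estimates (arbitrary units).
loads = {"A": 40, "B": 35, "C": 15, "D": 10}

# Two ways of mapping the same four systems onto two nodes.
clumped = {"node1": ["A", "B"], "node2": ["C", "D"]}
balanced = {"node1": ["A", "D"], "node2": ["B", "C"]}

clumped_totals = node_totals(clumped, loads)
balanced_totals = node_totals(balanced, loads)

# Total work is identical in both cases; only the busiest node changes.
print(sum(clumped_totals.values()), max(clumped_totals.values()))    # 100 75
print(sum(balanced_totals.values()), max(balanced_totals.values()))  # 100 50
```

If a node starts to dilate time somewhere above a certain load, the clumped mapping gets there much sooner than the balanced one, even though neither mapping removes any work.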
So with that being said I'm going to try and answer some of the more frequent questions here.
"Adjacent systems" are bad for fleet movement / staging.
They always have been. They've actually been worse because the old system would be so aggressive about grouping systems of the same constellation together that it would, in so many cases that I was given time to work on this, choose to overload the node rather than split up the constellation.
If your staging system is not reinforced, it's going to share its node with other systems. If the fleet is large enough to cause TiDi it will cause TiDi no matter what these other systems are. It's even possible that this staging system had enough load caused on it the day before to be reinforced on its own because it's simply too loaded to share its node with any other system. But that will not help you with TiDi if your fleet is large enough to overload that node.
If you control the space around your staging system, you can now command people to stay out of those systems to avoid TiDi-ing your fleet's staging system. I'd love to offer you a map of all node allocations so that you could discern whether or not that was needed, but I'm certain people would metagame TiDi into existence through that.
I'm not sure what more to say. Nobody likes being TiDi'd. I'm not going to try to convince you to like it. But you'll be able to anticipate it now. And in case it's not clear: Nullsec and Empire do not run on the same nodes, and they have not for many years. Fleets in Nullsec thus do not cause TiDi in Empire, or vice versa. They cause TiDi in other Nullsec systems. This means that under the old system a staging system from the north could be allocated to a node already running a staging system from the south. That can no longer happen. That's something, yeah?
This does not help with sudden escalations.
Absolutely not. This is a static load balancer that balances systems between nodes at server startup. I'm actually reading a paper from some people at the University of Bonn about predicting destinations based on previous system jumps. It's pretty interesting but my brain now hurts. But if we could hook something like that in we could detect staging systems forming. We'd still not detect sudden escalations. Sudden escalations need dynamic load balancing if we're to handle them gracefully under the current CPU per User fingerprint.
What about reinforced nodes?
Reinforcing a system means that we usually move it to a single node that is running nothing other than that system. We currently have three nodes on standby for the premapper to use according to reinforcement requests. Any system marked for reinforcement at startup is completely excluded from this premapping process. Such systems are effectively marked as "Fleet Fight Systems" rather than "Null Sec Systems" and thus the "Null Sec System" load balancing method will not include them.
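The exclusion step can be sketched in Python as follows. This is only an illustration under stated assumptions: `premap` is a hypothetical function, the loads are invented, and the greedy "heaviest system onto the least-loaded node" rule is a stand-in that ignores the proximity grouping the real premapper performs.

```python
import heapq

def premap(systems, reinforced, node_count):
    """Return (dedicated, shared) mappings.

    systems:    dict of system name -> estimated load (hypothetical numbers)
    reinforced: set of system names marked for reinforcement at startup
    node_count: number of ordinary shared nodes available
    """
    # Reinforced systems skip balancing entirely and get a node to themselves.
    dedicated = {name: [name] for name in sorted(reinforced) if name in systems}

    # Min-heap of (current total load, node index) for the shared nodes.
    heap = [(0.0, i) for i in range(node_count)]
    heapq.heapify(heap)
    shared = {i: [] for i in range(node_count)}

    # Stand-in balancing rule: place the heaviest remaining system on the
    # currently least-loaded node. Adjacency is deliberately ignored here.
    for name, load in sorted(systems.items(), key=lambda kv: -kv[1]):
        if name in reinforced:
            continue
        total, idx = heapq.heappop(heap)
        shared[idx].append(name)
        heapq.heappush(heap, (total + load, idx))
    return dedicated, shared

dedicated, shared = premap(
    {"A": 5, "B": 4, "C": 3, "D": 2, "R": 50}, reinforced={"R"}, node_count=2
)
print(dedicated)  # {'R': ['R']}
print(shared)     # {0: ['A', 'D'], 1: ['B', 'C']}
```

The point of the sketch is only the ordering: the reinforced system's load of 50 never enters the balancing loop, so it cannot crowd ordinary systems off the shared nodes.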
Dynamic V Static Load Balancing.
As I mentioned, dynamic load balancing would solve a lot of our issues. But there are massive hurdles to that happening. I know that sounds weird to some people that work in an environment where it's easy as pie. But that's not our environment. We simply are not in the state where we can move a system between nodes without offlining everyone first. Would we like to be in that state? Of course! But we're not. So we're stuck with the static approach until that changes.
So instead we run three other spare nodes (that are not the fleet fight nodes mentioned above) that we can allocate systems to if we need to separate them from a system with a sudden escalation fight in it.
Why is Empire split from Null?
Because Empire has a completely different load fingerprint than Nullsec (Crimewatch has been mentioned). Players in Empire also have a fairly different behaviour than players in Nullsec. Wormhole space is also separated from these two groups of systems for the same reasons.
Sadly I think we have too few WH space nodes allocated now. If Empire runs smoothly today and tomorrow then I'm thinking I'll move one or two empire nodes into the WH space rotation for a better weekend experience.
And now I have to run to a meeting (probably already too late by the time I finish editing this on the forums). Sorry if these answers feel a bit abrupt, I was in a hurry.
|
|
CCP Prism X
C C P C C P Alliance
1405
|
Posted - 2013.12.04 16:33:00 -
[7] - Quote
Vincent Athena wrote: So what efforts are being made to reduce the CPU footprint per user?
I'm not privy to the plans of others, and have never been one to promise work on behalf of anyone other than myself. But I can totally tell you that I and RESCINDED just joined Gridlock to contribute to the Brain in a Box project.
That's of course a project outside our feature expansion release cycle. It's done when it's done. So I can't tell you anything more concrete than that. But feel free to berate me about it until then. I've got a hide as thick as my head. But I don't want you to do that to RESCINDED because that would be promising stuff on their behalf.
|
|
CCP Prism X
C C P C C P Alliance
1405
|
Posted - 2013.12.04 16:41:00 -
[8] - Quote
Sentient Blade wrote: I've mentioned it elsewhere, but why are these machines not virtualised (or are they?) surely something like vMotion would be able to move high-use systems onto dedicated hardware without the need to pause anything.
Can I say "Because if it was that easy we'd have done it already" and leave it at that? I'd rather not try to elaborate on that very complex subject because I don't know everything about everything and I'd rather not accidentally lie to you.
|
|
CCP Prism X
C C P C C P Alliance
1405
|
Posted - 2013.12.04 16:49:00 -
[9] - Quote
Vincent Athena wrote: I've heard the words "Brain in a Box" quite a bit and seen vague descriptions of it having to do with preparing session change data on a separate node. But is there a full description somewhere? What it does, how much load it will remove, and so on?
It's meant to offload work currently being done by solar system nodes onto a different node, as well as to reduce the total amount of work needing to be done over and over again at certain times. How much it will offload is wholly dependent on the end result and how much of the current code is refactored into C.
I'm not sure if there are external sources with information for you. If I were at the office I could look something up for you or ask people. But I'm not. Perhaps I'll remember tomorrow!
|
|
CCP Prism X
C C P C C P Alliance
1405
|
Posted - 2013.12.04 17:08:00 -
[10] - Quote
Mioelnir wrote: @ Prism X: Over what timeframe are the cpu metrics used for premapping collected? Could a single escalated fight in a usually empty system not skew the metrics, forcing the system to become reinforced the next day? Are outlier datapoints stripped?
It's really not as sophisticated as it should be. Essentially we do know the load from hardware metrics but have to split that load between the many different systems running on the node. To do that we store the time it takes one simulation loop (IIRC, I am not in the office and don't feel like VPNing) to finish for a given system. Using the ratios between those times, we can estimate the % load that belongs to each system, and we use that to evolve the value.
Outliers are factored in here; I'm not sure if it would be a good idea not to, as they can be indicative of staging systems. Some outliers are however ignored: any system that's been moved to the Incursion load balancing group will be ignored while it is there (again IIRC). So a single escalated fight will skew the system for a bit, but it will then start regressing again.
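As a rough Python sketch of the estimate described above: a node's measured load is split between its systems in proportion to their simulation loop times, and each system's stored value is then nudged toward the new observation. The function names, the numbers, and in particular the exponential smoothing used for "evolving" the value are assumptions for illustration; the post does not say how the real code evolves it.

```python
def split_node_load(node_load_pct, loop_times):
    """Attribute a node's measured load to its systems by loop-time ratio."""
    total = sum(loop_times.values())
    if total == 0:
        return {name: 0.0 for name in loop_times}
    return {name: node_load_pct * t / total for name, t in loop_times.items()}

def evolve(stored, observed, alpha=0.5):
    """Move the stored estimate part of the way toward the new observation.

    Exponential smoothing is an assumed stand-in for the real update rule.
    """
    return stored + alpha * (observed - stored)

# A node at 80% load running two systems with a 1:3 loop-time ratio.
shares = split_node_load(80.0, {"quiet": 1.0, "busy": 3.0})
print(shares)  # {'quiet': 20.0, 'busy': 60.0}

# One escalated fight (the 90.0) in a normally quiet system.
value = 10.0
history = []
for observed in [10.0, 90.0, 10.0, 10.0, 10.0]:
    value = evolve(value, observed)
    history.append(value)
print(history)  # [10.0, 50.0, 30.0, 20.0, 15.0]
```

With this kind of update the spike is not stripped as an outlier. It raises the estimate for a while and then decays, which matches the "skew for a bit, then start regressing" behaviour described above.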
This code has pretty much not been touched. I made a minor change to it when we first started noticing empire going all whack. It used to assign load by number of jumps and docks in a system but that's in no way related to the load an empire system will sustain. It might have worked for nullsec, but nullsec runs either cool or burning hot so trying to find the perfect balance there is an exercise in futility. We need fleet fight prediction tools for that and the ones we currently have are riddled with false positives and have been turned off.
But yeah, this is the first step of many we need to do just for the static balancer. Changing the load evolution will probably be the next one. BiB comes first tho.
|
|
|
CCP Prism X
C C P C C P Alliance
1431
|
Posted - 2013.12.11 15:42:00 -
[11] - Quote
I dislike leaving things hanging.
Wormhole Mass reduction is well outside of the scope of my work. Obviously mass should not be decreased on a denied jump but that should be bug reported through the proper channels so that it receives proper due diligence. Apparently this has been BR'd already so YAY!
Libras are not trustworthy. We're extremely dodgy characters who seem to be willing to adopt any side of everything for the sake of whatever deviant impulses drive us at that given moment.
I do believe everything else has been covered already in previous posts.
Fly safe... ..ish!
|
|
CCP Prism X
C C P C C P Alliance
1432
|
Posted - 2013.12.13 08:19:00 -
[12] - Quote
No. Distributing connected solar systems between different nodes will not make any load disappear. It will just move it elsewhere. Perhaps I wasn't clear enough when I said that intra-node jumps were more expensive: It's more expensive for that specific node as it will have to do both the tasks. Jumping between nodes does not magic away some part of the work that needs to be done. It's just done by two nodes. Remember: We do not want a fleet in the south to cause TiDi in the north.
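A toy Python accounting of that argument, assuming each jump costs one unit of work on the departure side and one on the arrival side. Those unit costs, the system and node names, and `jump_work` itself are hypothetical; real costs are not given in this thread.

```python
from collections import Counter

def jump_work(jumps, node_of, depart_cost=1.0, arrive_cost=1.0):
    """Charge each jump's departure and arrival work to the owning nodes."""
    work = Counter()
    for src, dst in jumps:
        work[node_of[src]] += depart_cost
        work[node_of[dst]] += arrive_cost
    return dict(work)

jumps = [("A", "B")] * 3  # three ships jumping from system A to system B

same_node = jump_work(jumps, {"A": "n1", "B": "n1"})
two_nodes = jump_work(jumps, {"A": "n1", "B": "n2"})

print(same_node)  # {'n1': 6.0}
print(two_nodes)  # {'n1': 3.0, 'n2': 3.0}
```

Under these assumptions the cluster does six units of work either way. Putting the two systems on different nodes only changes which node pays for which half.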
And to reiterate this point: This is not about reducing load, it's about using currently available resources more efficiently. I've already started on another project, that will take some time, that is aimed at actually reducing load generated per client. It will also most probably end up offloading much work from the system nodes to other nodes, thus making any work put into "no adjacent systems on the same node" redundant. I'd rather just start working on an actual performance boost than attempt to satisfy unpredictable spike load with a static load balancer.
|
|
|
|